Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora
Authors
Abstract
Much of scientific progress stems from previously published findings, but searching through the vast sea of scientific publications is difficult. We often rely on metrics of scholarly authority to find prominent authors, but these authority indices do not differentiate authority by research topic. We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, citations, and topical authority in a corpus of academic papers. LTAI differs from previous models in two main respects. First, it explicitly models the generative process of the citations rather than treating them as given. Second, it models each author’s influence on a paper's citations based on the topics of both the cited and the citing papers. We fit LTAI to four academic corpora: CORA, arXiv Physics, PNAS, and CiteSeer. We compare LTAI against a range of baselines, from latent Dirichlet allocation to more advanced models such as the author-link topic model and the dynamic author citation topic model. The results show that LTAI achieves higher accuracy than other similar models when predicting the words, citations, and authors of publications.
Similar resources
A Contrastive Investigation of Intertextuality in Research Articles Authored by Iranian vs. English Writers in Applied Linguistics
Academic discourse enables others' voices in a text to be realized through conventionalized citational patterns. However, among a variety of factors, one thing that may influence the way others' voices are textualized is writers' affiliations to different cultures. Following this assumption, the present contrastive study attempted to explore manifest intertextual constructions across th...
Comparative Study of the Academic Vocabulary Content of Electronic Engineering Corpora, GE Materials and M.S. Entrance Examinations
The importance of vocabulary learning has been underlined in the field of English for Academic Purposes (EAP) because non-English majors who require reading English texts in their fields of study have to expand their English vocabulary knowledge much more efficiently than ordinary ESL/EFL learners. Since academic vocabulary instruction in Iranian universities is realized through the use of Gene...
Topic Cropping: Leveraging Latent Topics for the Analysis of Small Corpora
Topic modeling has gained a lot of popularity as a means for identifying and describing the topical structure of textual documents and whole corpora. There are, however, many document collections such as qualitative studies in the digital humanities that cannot easily benefit from this technology. The limited size of those corpora leads to poor quality topic models. Higher quality topic models ...
Discovering Hidden Topical Hubs and Authorities in Online Social Networks
Finding influential users in online social networks is an important problem with many possible useful applications. HITS and other link analysis methods, in particular, have often been used to identify hub and authority users in web graphs and online social networks. These works, however, have not considered the topical aspect of links in their analysis. A straightforward approach to overcome this ...
A Comparative Study of Academic Impact and Wikipedia Ranking
In addition to its broad popularity, Wikipedia is also widely used for scholarly purposes. Many Wikipedia pages pertain to academic papers, scholars and topics, providing a rich ecology for scholarly uses. Although many recognize the scholarly potential of Wikipedia, as a crowdsourced encyclopedia its authority and quality are questioned due to the lack of rigorous peer review and supervision. Sch...
Journal title:
- TACL
Volume 5, Issue
Pages -
Publication date: 2017